Learning Local Invariant Mahalanobis Distances

Authors

  • Ethan Fetaya
  • Shimon Ullman
Abstract

For many tasks and data types, there are natural transformations to which the data should be invariant or insensitive. For instance, in visual recognition, natural images should be insensitive to rotation and translation. This requirement and its implications have been important in many machine learning applications, and tolerance for image transformations was primarily achieved by using robust feature vectors. In this paper we propose a novel and computationally efficient way to learn a local Mahalanobis metric per datum, and show how we can learn a local metric that is invariant to a given transformation in order to improve performance.

Metric learning is a machine learning task which learns a distance metric d(x, y) between data points, based on data instances. As distances play an important role in many machine learning algorithms, e.g. k-Nearest Neighbor and k-Means clustering, finding an appropriate metric for the task can improve performance considerably. This approach has been applied successfully to many problems such as face identification [11], image retrieval [12, 6], ranking [16] and clustering [22], to name just a few.

A standard approach to metric learning is to learn a global Mahalanobis metric

d_M(x, y) = (x - y)^T M (x - y)    (1)

where M is a positive semi-definite (PSD) matrix. The PSD constraint only ensures that this is a pseudometric, but for simplicity we will not make this distinction. Various algorithms [21, 1, 9] differ in the objective through which they learn the matrix M from the data. As M is a PSD matrix, it can be written as M = L^T L, and therefore

d_M(x, y) = (x - y)^T M (x - y) = ||x̃ - ỹ||^2, where x̃ = Lx, ỹ = Ly.

This means that finding an optimal Mahalanobis distance is equivalent to finding the optimal linear transformation of the data and then using the L2 distance on the transformed data.
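As a quick sanity check of this equivalence, the following minimal NumPy sketch (toy random data, not from the paper) verifies that the Mahalanobis distance under M = L^T L matches the squared L2 distance after mapping the points through L:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5

# Build a PSD matrix M as L^T L, so the PSD constraint holds by construction.
L = rng.standard_normal((d, d))
M = L.T @ L

x = rng.standard_normal(d)
y = rng.standard_normal(d)

# Squared Mahalanobis distance: d_M(x, y) = (x - y)^T M (x - y).
diff = x - y
d_mahalanobis = diff @ M @ diff

# Equivalent view: transform the data by L, then use the squared L2 distance.
d_transformed = np.sum((L @ x - L @ y) ** 2)

print(np.isclose(d_mahalanobis, d_transformed))  # True
```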
This approach has two limitations: first, it is restricted to linear transformations; second, it requires a large amount of labeled data. One way to overcome the first limitation is to use local distances [10], where we learn a unique distance function per training datum. Local approaches do not, in general, produce a global metric (as they are usually not symmetric), but they are commonly considered metric learning nonetheless. These methods, in general, need similar and dissimilar training data for each local metric. In our current work we use a local approach inspired by the work on exemplar-SVM [18], which showed that using only negative examples can suffice for good performance. The intuition behind this is that objects of the same class do not necessarily have to be similar, but objects from different classes must be dissimilar. We will show how to learn a local Mahalanobis distance that, for each datum, tries to keep the non-class as far away as possible. This approach can use a large amount of weakly supervised data, as in many cases negative examples are easier than positive examples to acquire. For example, if we are interested in face identification, we can learn a local metric around a query face image given a bank of training face images, which we only assume do not belong to the queried person. Unlike other metric learning methods, we will not need any labels indicating which image belongs to which person in the negative set.

The intuition for why Mahalanobis distances are the natural model for local metrics is simple. Assume we have some metric d(x, y) on the dataset and assume that it is smooth (at least twice continuously differentiable). From the metric properties we know that if we fix x and look at f(y) = d(y, x), then f has a global minimum at y = x. Applying a second-order Taylor approximation to f around x we get

d(y, x) = f(y) ≈ f(x) + (y - x)^T ∇f(x) + (1/2)(y - x)^T ∇²f(x)(y - x) = (1/2)(y - x)^T ∇²f(x)(y - x)    (2)

The equality holds since x is the global minimum, so f(x) = d(x, x) = 0 and ∇f(x) = 0; this also implies that the Hessian ∇²f(x) is positive semidefinite. While the Taylor approximation only holds for values of y close to x, metric methods such as k-NN focus on similar objects, so the approximation should be good at the points of interest. This observation leads us to look for local metrics of the Mahalanobis form.

We will first define our local Mahalanobis distance learning method as a semidefinite programming problem. We will then show how this problem can be solved efficiently, without any costly matrix decompositions, which allows us to solve high-dimensional problems that regular semidefinite solvers cannot handle. The second major contribution of this paper is to show how invariant local metrics can be learned. In many cases we know there are simple transformations to which our metric should not be sensitive, for example small translations and rotations of natural images. We know a priori that if x′ = T(x), where T is such a transformation, then d(x, x′) ≈ 0. We will show how this prior knowledge about our data can be incorporated by learning a local invariant metric. This too can be done efficiently, and we will show in our experiments that it improves performance.
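The paper incorporates invariance into the learning problem itself; as a rough standalone illustration only (not the authors' algorithm), the sketch below shows what insensitivity to a transformation means for a Mahalanobis metric. Here M0 is a placeholder for some local PSD metric and t a hypothetical tangent direction of the transformation at x, t ≈ T(x) − x; projecting t out of M0 yields a PSD metric that assigns (near-)zero length to moves along t:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 10

# Hypothetical stand-ins: M0 plays the role of a learned local PSD metric,
# t the tangent direction of the transformation T at x (t ~ T(x) - x).
A = rng.standard_normal((d, d))
M0 = A.T @ A
t = rng.standard_normal(d)
t /= np.linalg.norm(t)

# Project the transformation direction out of the metric: P is the
# orthogonal projector onto the complement of t, and P @ M0 @ P stays PSD.
P = np.eye(d) - np.outer(t, t)
M_inv = P @ M0 @ P

print(t @ M_inv @ t)  # ~0: moving along the transformation costs nothing
v = rng.standard_normal(d)
print(v @ M_inv @ v)  # generally > 0: other directions keep positive length
```

This projection view enforces d(x, x + εt) ≈ 0 only to first order in the transformation; the paper instead learns the invariant metric directly, which it shows can also be done efficiently.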

Similar resources

Image Retrieval and Classification Using Local Distance Functions

Mahalanobis distance: (x − x′)^T A (x − x′). Previous work on learning metrics has focused on learning a single distance metric for all instances. One of our primary contributions is to learn a distance function for every training image. Most visual categorization approaches make use of machine learning after computing distances between images (e.g. SVM with pyramid kernel). We want to learn how to co...


Modeling Perceptual Color Differences by Local Metric Learning

Having perceptual differences between scene colors is key in many computer vision applications such as image segmentation or visual salient region detection. Nevertheless, most of the time we only have access to the rendered image colors, without any means to go back to the true scene colors. The main existing approaches propose either to compute a perceptual distance between the rendered ima...


metricDTW: local distance metric learning in Dynamic Time Warping

We propose to learn multiple local Mahalanobis distance metrics to perform k-nearest neighbor (kNN) classification of temporal sequences. Temporal sequences are first aligned by dynamic time warping (DTW); given the alignment path, similarity between two sequences is measured by the DTW distance, which is computed as the accumulated distance between matched temporal point pairs along the alignme...


Large Margin Nearest Neighbor Classification using Curved Mahalanobis Distances

We consider the supervised classification problem of machine learning in Cayley-Klein projective geometries: We show how to learn a curved Mahalanobis metric distance corresponding to either the hyperbolic geometry or the elliptic geometry using the Large Margin Nearest Neighbor (LMNN) framework. We report on our experimental results, and further consider the case of learning a mixed curved Mah...


Feature Vector Similarity Based on Local Structure

Local feature matching is an essential component of many image retrieval algorithms. Euclidean and Mahalanobis distances are mostly used in order to compare two feature vectors. The first distance does not give satisfactory results in many cases and is inappropriate in the typical case where the components of the feature vector are incommensurable, whereas the second one requires training data....



Publication date: 2015